Speech recognition system for an automotive vehicle.

A speech recognition device can record or recognize
spoken instruction phrases reliably even when engine noise
increases within the passenger compartment. When the
engine begins to operate, the gain of an amplifier of a speech
input section (6) or the threshold level of a voice detector (7)
is switched so as to reduce the sensitivity to spoken
instructions. As a result, the driver must necessarily utter a
spoken instruction in a louder voice, and thus the proportion
of noise level to spoken instruction signal level is reduced.
The system comprises engine operation detecting means
(15, 16, 17, 6), an analog switch, two amplifiers or two
multipliers, etc., in addition to the conventional speech
recognizer.

SPEECH RECOGNITION SYSTEM FOR AN AUTOMOTIVE VEHICLE

BACKGROUND OF THE INVENTION 

Field of the Invention 

The present invention relates generally to a 
speech recognition system for an automotive vehicle, and 
more particularly to a speech recognition system by which 
driver's spoken instructions can be reliably recorded or 
recognized even when engine noise increases within the 
passenger compartment after the vehicle engine begins to 
operate. 

Description of the Prior Art 

There is a well-known speech recognizer which can 
activate various actuators in response to human spoken 
instructions. When this speech recognizer is mounted on a 
vehicle, the headlight, for instance, can be turned on or 
off in response to spoken instructions such as "Headlight 
on" or "Headlight off". Such a speech recognizer usually 
can recognize various spoken instructions in order to 
control various actuators; however, there are some problems
involved in applying this system to an automotive vehicle. 

Usually, the speech recognizer is used in a
relatively quiet environment; however, the speech
recognition system for an automotive vehicle is usually
used within a relatively noisy passenger compartment, in
which the noise level additionally fluctuates intensely.
Therefore, one of the major problems is how to cope with
erroneous spoken phrase recordings or recognitions caused
by fluctuating engine noise within the passenger
compartment.

In the prior-art speech recognizer, since a
spoken instruction signal including noise is always
amplified at a constant gain factor, when the noise level
within the passenger compartment rises, especially
when the engine begins to operate and the engine
noise is therefore inputted to the speech recognizer at random, the
noise mixed with the spoken instruction signal at a
relatively high ratio is also amplified together with the
spoken instruction signal, thus causing a problem in that
the spoken instruction cannot be recognized reliably or is
recognized erroneously so as to operate a wrong vehicle device
actuator.

Furthermore, in order to distinguish a spoken
instruction from noise, conventionally there is provided a
voice detector in the speech recognizer, by which the start
and the end of a spoken instruction are determined by
detecting whether the magnitude of a spoken instruction
signal exceeds a predetermined reference threshold voltage
level for a predetermined period of time or whether the
magnitude of the spoken instruction signal drops below the
predetermined reference threshold voltage level for
another predetermined period of time, respectively.

In the prior-art speech recognizer, however,
since the reference threshold voltage level is fixed, when
the noise signal level is high, for instance, when the vehicle
is running and therefore the noise level exceeds the
reference threshold voltage level for a long time, there
exists a problem in that the voice detector can erroneously
consider this state to represent the beginning of a spoken
instruction. In other words, the prior-art speech
recognizer is prone to erroneous recognition due to intense
noise within the passenger compartment.

A more detailed description of a typical prior-art
speech recognizer and a typical prior-art voice
detector will be made with reference to the attached
drawings in conjunction with the present invention under
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS.

SUMMARY OF THE INVENTION

With these problems in mind, therefore, it is the
primary object of the present invention to provide a speech
recognition system for an automotive vehicle which can
record or recognize spoken instruction phrases reliably
even when engine noise increases within the passenger
compartment. In more detail, when the engine begins to
operate, the gain of an amplifier in the speech input
section or the threshold level of the voice detector
section is switched so as to reduce the sensitivity to
spoken instructions. As a result, the driver must
necessarily utter a spoken instruction in a louder voice,
and thus the proportion of noise level to spoken
instruction signal level is reduced.

To achieve the above-mentioned object, the speech
recognition system for an automotive vehicle according to
the present invention comprises an engine operation
detector or a speed sensor, an analog switch, two
amplifiers having different gain factors or
two multipliers having different multiplication ratios,
etc., in addition to or in place of the
elements or sections of the conventional speech recognizer.

BRIEF DESCRIPTION OF THE DRAWINGS

The features and advantages of the speech 
recognition system for an automotive vehicle according to 
the present invention will be more clearly appreciated from 
the following description taken in conjunction with the 
accompanying drawings in which like reference numerals 
designate corresponding elements or sections throughout 
the drawings and in which:

Fig. 1 is a schematic block diagram of a typical
prior-art speech recognizer for assistance in explaining
the operations thereof;

Fig. 2 is a schematic block diagram of a detailed
portion of the voice detecting means of the prior-art
speech recognizer shown in Fig. 1;

Fig. 3(A) is a graphical representation of the
waveforms of a spoken instruction signal including noise as
measured at point (A) in Fig. 2;

Fig. 3(B) is a graphical representation of the
waveforms of the spoken instruction signal including noise
and a reference threshold voltage level as measured at
point (B) in Fig. 2;

Fig. 3(C) is a graphical representation of the
waveform of the spoken instruction pulse signal as measured
at point (C) in Fig. 2;

Fig. 3(D) is a graphical representation of the
waveform of the spoken instruction start/end signal as
measured at point (D) in Fig. 2;

Fig. 4 is a schematic block diagram of a first
embodiment of the speech recognition system according to
the present invention, in which only an essential portion
of the system is shown together with an engine operation
detector and in which two amplifiers are switched in the
speech input section;

Fig. 5 is a schematic block diagram of a second
embodiment of the speech recognition system according to
the present invention, in which only an essential portion
of the system is shown and in which two feedback resistors
are switched in the speech input section; and

Fig. 6 is a schematic block diagram of a third
embodiment of the speech recognition system according to
the present invention, in which only an essential portion
of the system is shown together with an engine operation
detector and in which two multipliers are switched in the
speech input section.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

To facilitate understanding of the present
invention, a brief reference will be made to the principle
or operation of a typical prior-art speech recognizer, with
reference to Fig. 1.

Fig. 1 shows a schematic block diagram of a
typical speech recognizer 100. To use the speech
recognizer, the user must first record a plurality of
predetermined spoken instructions. Specifically, in this
spoken instruction recording mode (reference mode), the
user first depresses a record switch 1 disposed near the
user. When the record switch 1 is depressed, a switch
input interface 4 detects the depression of the record
switch 1 and outputs a signal to a controller 5 via a wire
4a. In response to this signal, the controller 5 outputs a
recording mode command signal to other sections in order to
preset the entire speech recognizer to the recording mode.
In the spoken instruction recording mode, when the user
says a phrase to be used as a spoken instruction, such as
"open doors", near a microphone 2, the spoken phrase is
transduced into a corresponding electric signal through the
microphone 2, amplified through a speech input interface 6
consisting mainly of a spectrum-normalizing amplifier,
smoothed through a root-mean-square (RMS) smoother 15
including a rectifier and a smoother, and finally inputted
to a voice detector 7.

The spectrum-normalizing amplifier amplifies the
input at different gain levels at different frequencies, so
as to adjust the naturally frequency-dependent power
spectrum of human speech to a more nearly flat power
spectrum. The voice detector 7 detects whether or not the
magnitude of the spoken phrase signal exceeds a predetermined
level for a predetermined period of time (150 to
250 ms) in order to recognize the start of the spoken
phrase input signal, and whether or not the magnitude of the
signal drops below a predetermined level for a predetermined
period of time (about 300 ms) in order to
recognize the end of the signal. Upon detection of the
start of the signal, this voice detector 7 outputs another
recording mode command signal to the controller 5. In
response to this command signal, the controller 5 activates
a group of bandpass filters 8, so that the spoken phrase
signal from the microphone 2 is divided into a number of
predetermined frequency bands. Given to a parameter
extraction section 9, the frequency-divided spoken phrase
signals are squared or rectified therein in order to derive
the voice power spectrum across the frequency bands and
then converted into corresponding digital time-series
matrix-phonetic pattern data (explained later). These data
are then stored in a memory unit 10. In this case,
however, since the speech recognizer is set to the spoken
instruction recording mode by the depression of the record
switch 1, the time-series matrix-phonetic pattern data are
transferred to a reference pattern memory unit 11 and
stored therein as reference data for use in recognizing the
speech instructions.

After having recorded the reference spoken
instructions, the user can input speech instructions, such
as "open doors", to the speech recognizer through the
microphone 2 while depressing a recognition switch 3.

When this recognition switch 3 is depressed, the
switch input interface 4 detects the depression of the
recognition switch 3 and outputs a signal to the controller
5 via a wire 4b. In response to this signal, the
controller 5 outputs a recognition mode command signal to
other sections in order to preset the entire speech
recognizer to the recognition mode. In this spoken phrase
recognition mode, when the user says, near the microphone
2, an instruction phrase similar to the one recorded
previously and when the voice detector 7 outputs a signal, the
spoken instruction is transduced into a corresponding
electric signal through the microphone 2, amplified through
the speech input interface 6, filtered and divided into
voice power spectra across the frequency bands through the
bandpass filters 8, squared or rectified and further
converted into corresponding digital time-series
matrix-phonetic pattern data through the parameter extraction
section 9, and then stored in the memory unit 10, in the
same manner as in the recording mode.

Next, the time-series matrix-phonetic pattern
data stored in the memory unit 10 in the recognition mode
are sequentially compared with the time-series
matrix-phonetic pattern data stored in the reference pattern
memory unit 11 in the recording mode by a resemblance
comparator 12. The resemblance comparator 12 calculates
the level of correlation of the inputted speech instruction
to the reference speech instruction after time
normalization and level normalization to compensate for variable
speaking rate and loudness (because the same person might speak quickly
and loudly at one time but slowly and in a whisper at some
other time). The correlation factor is usually obtained by
calculating the Tchebycheff distance (explained later)
between recognition-mode time-series matrix-phonetic
pattern data and recording-mode time-series
matrix-phonetic pattern data. The correlation factor calculated
by the resemblance comparator 12 is next given to a
resemblance determination section 13 to determine whether
or not the calculated values lie within a predetermined
range, that is, to evaluate their cross-correlation. If
they lie within the range, a command signal, indicating that the
recognition-mode spoken instruction has adequate
resemblance to one of the recorded instruction phrases, is
outputted to one of the actuators 14 in order to open the
vehicle doors, for instance. The above-mentioned
operations are all executed in accordance with command
signals outputted from the controller 5.

Description has been made hereinabove of the case
where the speech recognizer 100 comprises various discrete
elements or sections; however, it is of course possible to
embody the speech recognizer 100 with a microcomputer
including a central processing unit, a read-only memory, a
random-access memory, a clock oscillator, etc. In this
case, the voice detector 7, the parameter extraction
section 9, the memory 10, the reference pattern memory 11,
the resemblance comparator 12 and the resemblance
determination section 13 can all be incorporated within the
microcomputer, executing the same or similar processes,
calculations and/or operations as explained hereinabove.

The digital time-series matrix-phonetic pattern 
data and the Tchebycheff distance are defined as follows: 

In the case where the number of the bandpass
filters is four and the number of time-series increments
for each is 32, the digital recording-mode time-series
matrix-phonetic pattern data can be expressed as

$$
F(A) = f_A(i,j) =
\begin{bmatrix}
f_A(1,1) & f_A(1,2) & f_A(1,3) & \cdots & f_A(1,32) \\
f_A(2,1) & f_A(2,2) & f_A(2,3) & \cdots & f_A(2,32) \\
f_A(3,1) & f_A(3,2) & f_A(3,3) & \cdots & f_A(3,32) \\
f_A(4,1) & f_A(4,2) & f_A(4,3) & \cdots & f_A(4,32)
\end{bmatrix}
$$

where A designates a first recording-mode speech
instruction (reference) (e.g. OPEN DOORS), i denotes the
filter index, and j denotes the time-series increment index.

If a first recognition-mode speech instruction
(e.g. OPEN DOORS) is denoted by the character "B", the
Tchebycheff distance can be obtained from the following
expression:

$$
|F(A) - F(B)| = \sum_{i=1}^{4} \sum_{j=1}^{32} \left| f_A(i,j) - f_B(i,j) \right|
$$
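
As a rough illustration of the comparison described above, the following Python sketch computes the distance exactly as the expression is written, that is, as the sum of absolute differences over all filter bands and time-series increments (a quantity that other texts often call the city-block or L1 distance). The function names and the acceptance limit `max_distance` are assumptions made for illustration only and do not appear in the specification.

```python
def pattern_distance(ref, test):
    """Sum of |f_A(i,j) - f_B(i,j)| over all filters i and time increments j."""
    if len(ref) != len(test) or any(len(r) != len(t) for r, t in zip(ref, test)):
        raise ValueError("patterns must have the same shape")
    return sum(abs(a - b) for r, t in zip(ref, test) for a, b in zip(r, t))


def is_match(ref, test, max_distance):
    """Hypothetical resemblance test: accept when the distance is within a limit."""
    return pattern_distance(ref, test) <= max_distance


# Two 4 x 32 patterns (four bandpass filters, 32 time increments each).
reference = [[0] * 32 for _ in range(4)]
candidate = [[1] * 32 for _ in range(4)]
print(pattern_distance(reference, candidate))
```

A smaller distance corresponds to a closer resemblance, so a determination section of the kind described would accept the candidate only when the value falls within its predetermined range.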

Fig. 2 shows in more detail the speech detection
section of the voice detecting means of the prior-art
speech recognizer shown in Fig. 1, which is closely
relevant to the present invention.

In the figure, a spoken phrase inputted via a 
microphone and transduced into a corresponding electric 
signal (100) first passes through the speech input 
interface 6. The interface 6 is mainly made up of a 
spectrum-normalizing amplifier by which the electric 
signal is amplified to a greater degree at higher 
frequencies. This is because speech sounds tend to be 
attenuated greatly in the higher frequency range. The 
waveform of the spoken instruction signal (200) including 
noise outputted from the spectrum-normalizing amplifier 6 
may appear as shown in Fig. 3(A). 

The amplified spoken instruction signal (200) is 
next applied to the bandpass filters 8 to begin the process 
of recognizing whether the signal is a correctly spoken 
instruction and to the RMS smoother 15, consisting mainly 
of a rectifier 15-1 and a smoother 15-2, to begin the 
process of detecting the start and end of the spoken 
phrase. The rectified and smoothed spoken instruction 
signal (400) may appear as shown in Fig. 3(B), in which T^ 
denotes a constant reference threshold voltage level. 

The smoothed signal (400) is then conducted to
the voice detector 7 including a voltage level comparator
7-1 and a pulse duration comparator 7-2. The voltage level
comparator 7-1 compares the voltage level of the smoothed
signal with the predetermined reference threshold voltage
level and outputs an H-voltage level pulse signal (600)
only while the voltage level of the speech instruction
signal exceeds the reference threshold level as depicted
in Fig. 3(C).

The pulse duration comparator 7-2 compares the
pulse width of the H-voltage level pulse signal (600) with
a predetermined reference spoken instruction start time t_s
and the pulse width of the L-voltage level pulse signal
(600) with another predetermined reference end time t_e, and
outputs an H-voltage level signal (700) only when the
H-voltage level pulse width exceeds the reference start
time t_s and an L-voltage level signal (700) only when the
L-voltage level pulse width exceeds the reference end
time t_e.

To explain in more detail with reference to Figs.
3(C) and (D), if the pulse width of the first H-voltage
level pulse signal is labeled t_1, since t_1 is shorter than
the reference start time t_s, the pulse duration comparator
7-2 outputs no H-voltage level signal. On the other hand,
if the pulse width of the second H-voltage level pulse
signal is labeled t_2, since t_2 is longer than the reference
start time t_s, the pulse duration comparator 7-2 outputs an
H-voltage level signal, indicating the start of a spoken
instruction. In this case, the H-voltage level start
signal (700) from the pulse duration comparator 7-2 is
delayed by the reference start time t_s after the actual
start time P of the spoken instruction. Thereafter, this
H-voltage level start signal is outputted until the
duration comparator 7-2 detects the end of the speech
instruction.

Next, when the H-voltage level pulse signal t_2
changes to an L-voltage level for a period of time t_3, since
t_3 is shorter than the reference end time t_e, the pulse
duration comparator 7-2 outputs no L-voltage level signal;
that is, the duration comparator 7-2 sustains the H-voltage
level signal.

Thereafter in this case, even if a third pulse
signal having a pulse width t_4 is outputted again from the
voltage level comparator 7-1, since the pulse duration
comparator 7-2 is still outputting an H-voltage level
signal, the operation of the duration comparator 7-2 is not
affected.

Next, when the H-voltage level pulse signal t_4
changes to an L-voltage level for a period of time t_5, since
t_5 is longer than the reference end time t_e, the pulse
duration comparator 7-2 outputs an L-voltage level signal,
indicating the end of the speech instruction. In this case,
the L-voltage level end signal from the duration comparator
7-2 is delayed by the reference end time t_e after the
actual end time of the speech instruction. Thereafter, the
end signal is outputted until the duration comparator 7-2
detects the start of another speech instruction.
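
The start/end logic described above can be sketched in Python as a small two-state detector operating on a sampled, already smoothed signal. This is only a minimal sketch under stated assumptions: the sampling period `dt_ms`, the function name `detect_speech`, and the default values of `start_ms` and `end_ms` (chosen from the 150 to 250 ms and roughly 300 ms figures quoted earlier) are illustrative and are not taken from the circuit itself.

```python
def detect_speech(levels, threshold, dt_ms, start_ms=150, end_ms=300):
    """Return ("start" | "end", sample_index) events for a smoothed signal.

    A start is reported once the signal has stayed above the threshold for
    longer than start_ms; an end is reported once it has stayed below the
    threshold for longer than end_ms. Shorter excursions are ignored.
    """
    events = []
    speaking = False   # current output state (H level while True)
    run_ms = 0         # duration of the current run that opposes the state
    for index, level in enumerate(levels):
        above = level > threshold
        if above != speaking:
            run_ms += dt_ms
            needed = end_ms if speaking else start_ms
            if run_ms > needed:
                speaking = not speaking
                events.append(("start" if speaking else "end", index))
                run_ms = 0
        else:
            run_ms = 0  # the opposing run was interrupted; start over
    return events


# 50 ms per sample: a short burst, a long pulse, a short dip, then silence.
samples = [0, 0, 2, 2, 0, 0, 2, 2, 2, 2, 2, 0, 0, 0, 2, 2,
           0, 0, 0, 0, 0, 0, 0, 0]
print(detect_speech(samples, threshold=1.0, dt_ms=50))
```

In this example the 100 ms burst and the 150 ms dip are both ignored, which mirrors the treatment of the short pulse and the short L-level period in the description, and each reported event lags the true transition by roughly the corresponding reference time.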

In response to the H-voltage level signal from
the duration comparator 7-2 as shown in Fig. 3(D), the
controller 5 outputs a command signal to activate a group
of bandpass filters 8 and other sections to recognize the
spoken instruction signal outputted from the
spectrum-normalizing amplifier 6, as already explained.

In the prior-art voice detecting means connected
to the microphone as described above, since the reference
threshold level in the voltage level comparator 7-1 is
fixed at a predetermined level, the speech recognizer
cannot cope well with the fluctuations of noise level
within the passenger compartment, with the result that
accurate detection of speech instruction start and end is
compromised, so that noise may be interpreted as attempts at
speech and/or spoken instructions may be ignored.

In view of the above description and with
reference to the attached drawings, the embodiments of the
speech recognition system for an automotive vehicle
according to the present invention will be described
hereinbelow.

Fig. 4 is a block diagram showing a first
embodiment of the present invention. In brief summation of
this embodiment, the gain factor of the speech input
interface is adjusted according to the engine operation;
that is, the gain factor is reduced while the engine is
operating.

As in the conventional speech recognizer 100,
there are provided a record switch 1, a microphone 2, a
recognition switch 3, a switch input interface 4, and a
controller 5. In addition to these elements, there are
provided first and second amplifiers 61 and 62 having first
and second gains G_1 and G_2 in the speech input interface 6
for amplifying and outputting the spoken instruction signal
from the microphone 2, the first gain G_1 being determined
to be higher than the second gain G_2. The outputs of the
amplifiers 61 and 62 are inputted to an analog switch 63.
As will be understood later, the fixed contact a of this analog
switch 63 is switched to the contact b as shown in Fig. 4
when the engine stops operating, but switched to the
contact c when the engine is operating. Therefore, when
the engine is stopped, the gain factor for the spoken
instruction signal is high; when the engine is running, the
gain factor for the spoken instruction signal is switched
to the lower value.

On the other hand, there are provided an engine
operation detector 15, an ignition relay 16 having a relay
contact closed when the ignition switch is turned on, and
an alternator 17, in order to detect the engine condition
and to switch the analog switch 63. The engine operation
detector 15 detects that the engine is operating and
outputs a signal to the controller 5 when the ignition
relay 16 is energized and also the alternator 17 outputs a
signal. In response to this signal, the controller 5 sets
the analog switch 63 to the amplifier 62 side. On the
other hand, when the ignition relay 16 is deenergized and
the alternator 17 outputs no signal, the engine operation
detector 15 detects that the engine is stopped and outputs no
signal to the controller 5. In response to this, the
controller 5 sets the analog switch 63 to the amplifier 61
side as shown.

Further, the reason why the alternator output 
signal is given to the engine operation detector 15 is that 
the alternator output is indicative of the engine operation 
condition because the alternator outputs a signal when the 
ignition switch is set to the starter position and 
therefore the engine starts rotating. 

In the system according to the present invention, 
however, it is also possible to apply an output signal 
generated from a vehicle speed sensor 171 or a speedometer 
171 to the controller 5, because the sensor or meter can 
also represent whether or not the engine or vehicle is 
running. 

Next, the operation of the first embodiment of
Fig. 4 will be described. When a spoken instruction is
uttered toward the microphone 2 with the record switch 1 or
the recognition switch 3 turned on while the engine is stopped,
since the engine operation detector 15 detects that the
engine is stopped, the controller 5 sets the analog switch 63 to
the amplifier 61 side. Therefore, the spoken instruction
signal from the microphone 2 is amplified on the basis of
the first gain G_1 preset in the first amplifier 61.

In contrast with this, in the state where the
engine is operating and therefore the vehicle is running,
since the engine operation detector 15 detects that the
vehicle is running and outputs a signal, the controller 5
switches the analog switch 63 to the second amplifier 62
side. Therefore, the spoken instruction signal from the
microphone 2 is amplified on the basis of the second gain
G_2, which is lower than the first gain G_1 preset in the first
amplifier 61.

Since the second gain G_2 is relatively low as
compared with the first gain G_1, in order to obtain a
spoken instruction signal exceeding a predetermined level
necessary for recording or recognizing, the driver must
necessarily utter a spoken instruction toward the
microphone 2 in a louder voice as compared with the case
where the engine is stopped. Therefore, even if the noise level
within the passenger compartment is high due to engine
operation, the level of a spoken instruction becomes
naturally high in comparison with the rise in noise level.
As a result, the proportion of noise level to spoken
instruction signal level is reduced, thus improving the
recording or recognition rate of a spoken instruction in
the speech recognition system.
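
The reasoning of the first embodiment can be checked with a short numerical sketch in Python. It assumes, for illustration only, a purely linear amplifier and a fixed output level that the amplified speech must reach; the function names and all numeric values are hypothetical and are not taken from the specification.

```python
def select_gain(engine_running, g1, g2):
    """Pick the amplifier gain: G_1 with the engine stopped, G_2 when running."""
    if g1 <= g2:
        raise ValueError("the first gain must be higher than the second gain")
    return g2 if engine_running else g1


def required_speech_level(required_output, gain):
    """Microphone-level speech amplitude needed to reach the required output."""
    return required_output / gain


def noise_to_speech_ratio(noise_level, required_output, gain):
    """Noise level divided by the speech level the driver is forced to produce."""
    return noise_level / required_speech_level(required_output, gain)


REQUIRED_OUTPUT = 10.0   # hypothetical level needed for recording or recognition
NOISE = 0.5              # hypothetical noise amplitude at the microphone

for running in (False, True):
    gain = select_gain(running, g1=10.0, g2=5.0)
    print(running, gain, noise_to_speech_ratio(NOISE, REQUIRED_OUTPUT, gain))
```

For the same noise amplitude, halving the gain doubles the speech level the driver must produce, so the noise-to-speech proportion at the recognizer input is halved in this simplified model.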

Fig. 5 is a block diagram showing a second
embodiment of the present invention, in which a feedback
resistor for the amplifier is switched according to the
engine operation; that is, the gain factor is reduced while
the engine is operating.

In this second embodiment, only one amplifier 60
is provided for the speech input interface 6; however, two
feedback resistors R_1 and R_2 are connected to the input
terminal of the amplifier 60 and selectively connected to
the output terminal of the amplifier 60 via the analog
switch 63 in response to the signal from the controller 5.

In more detail, when the ignition relay 16 is
deenergized and the alternator 17 outputs no signal, the
engine operation detector 15 detects that the engine is stopped
and outputs no signal to the controller 5.

Therefore, the analog switch 63 is set to the
first resistor R_1 side, as shown by a solid line in Fig. 5.
The larger the feedback resistor, the higher the gain of
the amplifier. Since the first resistor R_1 is predetermined
to be larger than the second resistor R_2, the gain
factor G of the amplifier 60 becomes high when the resistor
R_1 is connected between the input and output terminals thereof.
Further, in this case, the gain factor is determined as a
function of the input resistor R_0 and the feedback resistor
R_1 or R_2. When the analog switch 63 is set to the first
resistor R_1 side, the spoken instruction signal from the
microphone 2 is amplified on the basis of a higher gain.

Accordingly, even if the driver utters a spoken 
instruction in a low voice, the inputted spoken instruction 
phrase can be recorded or recognized reliably. 

In contrast with this, where the engine is
operating, since the controller 5 sets the analog switch 63
to the second resistor R_2 side, R_2 being smaller than R_1,
the spoken instruction is amplified on the basis of a lower
gain.

Accordingly, since the driver must necessarily 
utter a spoken instruction in a louder voice, even if noise 
level is high due to engine operation, the inputted spoken 
instruction phrase can be recorded or recognized reliably. 

In other words, by changing the gain factor of
the amplifier in dependence upon selection of feedback
resistors, it is possible to reduce the proportion of noise
level to spoken instruction signal level in the case where
the engine is operating, in the same way as in the first
embodiment.
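
The dependence of the gain on the resistors can be illustrated with a brief Python sketch. The text states only that the gain is a function of the input resistor and the selected feedback resistor; the sketch assumes an ideal inverting operational-amplifier stage, for which the gain magnitude is the feedback resistance divided by the input resistance. That topology, the function names and the resistor values are assumptions for illustration.

```python
def inverting_gain_magnitude(r_feedback, r_input):
    """|G| = R_feedback / R_input for an ideal inverting amplifier (assumed)."""
    if r_feedback <= 0 or r_input <= 0:
        raise ValueError("resistances must be positive")
    return r_feedback / r_input


def select_feedback(engine_running, r1, r2):
    """Analog switch: R_1 with the engine stopped, the smaller R_2 when running."""
    if r1 <= r2:
        raise ValueError("R_1 must be larger than R_2")
    return r2 if engine_running else r1


R0 = 10e3              # hypothetical input resistor, ohms
R1, R2 = 100e3, 50e3   # hypothetical feedback resistors, ohms

for running in (False, True):
    feedback = select_feedback(running, R1, R2)
    print(running, inverting_gain_magnitude(feedback, R0))
```

With these illustrative values the gain magnitude drops from 10 to 5 when the engine starts, which has the same effect on the noise-to-speech proportion as switching between two amplifiers.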

Fig. 6 is a block diagram showing a third
embodiment of the present invention, in which the reference
threshold level of the voltage level comparator 7-1 in the
voice detector is switched according to the engine
operation; that is, the multiplication ratio is increased
when the engine is operating.

As in the conventional speech recognizer 100,
there are provided a record switch 1, a microphone 2, a
recognition switch 3, a switch input interface 4, a
controller, an RMS smoother (a first smoother), a voice
detector, etc. In addition to these elements, there are
provided a second smoother 72, first and second multipliers
73a and 73b, an analog switch 74 and a holding circuit 75.
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In the same way as in the first -embodiment of 
Fig. 4, the output signal of the engine operation detector 
15 is given to the controller 5 so as to detect the engine 
conditions. The voice detector 7 for detecting the start 
5 point and the end point of a spoken instruction signal from 
the speech input interface 6 and for outputting a start 
command signal and an end command signal comprises a 
rectifier 15-1 for rectifying the spoken instruction 
signal, a first smoother 15-2 for smoothing the output 
10 signal from the rectifier 15-1 at a time constant of 10 to 
20 milliseconds and for outputting a DC voltage roughly 
corresponding to the spoken instruction, a second smoother 
72 for smoothing the output signal from the rectifier 15-1 
at a time constant of about one to second and for 
15 outputting a DC voltage roughly corresponding to noise 
included in the spoken signal, first and second multipliers 
73a and -73b for multiplying the output signal from the 
second smoother 72 at multiplication ratios of Kj^ and 
{Kj^<K2)f an analog switch 74 for selecting the output from 
20 the multipliers 73a and 73b in response to the signal from 
the engine operation detector 15, a holding circuit 75 for 
holding the output signal from either of two multipliers 
73a and 73b via the analog switch 74 when a start of a 
spoken instruction is detected, a level comparator 7-1 for 
25 comparing the DC voltage signal corresponding to the spoken 
instruction signal from the first smoother 7-1 with a 
reference signal e^ of the DC voltage corresponding to 
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noise level given via the holding circuit 75 and for 
outputting a spoken instruction start signal when the 
output level from the first smoother 15-2 exceeds the 
reference signal e^ , and a pulse duration comparator 7-2 
^ for outputting a spoken instruction start signal e^ 
indicative of presence of a spoken instruction signal to 
the controller 5 when the H-level output from the level 
comparator 7-1 is kept outputted, for instance, for more 
than 150 milliseconds and a spoken instruction end signal 

10 indicative of absence of a spoken instruction signal to 

the controller 5 when the output signal from the level 
comparator 7-1 drops to a L-level and is kept dropped for 
about 250 milliseconds. In this voice detector 7, the
analog switch 74 is switched to the first multiplier 73a
side when the engine operation detector 15 detects that the
engine is stopped, as shown by a solid line, and to the second
multiplier 73b side when the engine operation detector 15
detects that the engine is operating, as shown by a broken
line in Fig. 6. Further, the holding circuit 75 initially
passes the signal obtained from the multiplier 73a or 73b
via the analog switch 74 to the level comparator 7-1 as the
reference signal; however, once the pulse duration
comparator 7-2 outputs a start signal, the signal from the
multiplier 73a or 73b is held in response to this start
signal (i.e., the holding signal), and the holding circuit
keeps outputting the held signal as the reference signal
to the level comparator 7-1 until an end signal is
outputted to the holding circuit 75.
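
By way of illustration only, the behaviour of the pulse duration comparator 7-2 described above may be sketched in software as follows. The 150 ms and 250 ms figures are the example values given above; the class name, the function name and the 10 ms sampling interval are assumptions made for this sketch and do not appear in the embodiment.

```python
from dataclasses import dataclass

START_MS = 150  # H-level must persist for more than this to signal a start
END_MS = 250    # L-level must persist for about this long to signal an end


@dataclass
class PulseDurationComparator:
    """Turns the level comparator's H/L output into start/end events."""
    step_ms: int = 10       # hypothetical sampling interval (assumption)
    speaking: bool = False  # True between a start event and an end event
    run_ms: int = 0         # how long the state-changing level has persisted

    def update(self, level_high: bool):
        """Feed one comparator sample; return 'start', 'end', or None."""
        # Only the level that would change the state is timed:
        # H-level while idle, L-level while speaking.
        if level_high == self.speaking:
            self.run_ms = 0
            return None
        self.run_ms += self.step_ms
        if not self.speaking and self.run_ms > START_MS:
            self.speaking, self.run_ms = True, 0
            return "start"
        if self.speaking and self.run_ms >= END_MS:
            self.speaking, self.run_ms = False, 0
            return "end"
        return None


def detect(levels, step_ms=10):
    """Return (time_ms, event) pairs for a sequence of H/L samples."""
    comparator = PulseDurationComparator(step_ms=step_ms)
    events = []
    for i, high in enumerate(levels):
        event = comparator.update(high)
        if event:
            events.append(((i + 1) * step_ms, event))
    return events
```

In this sketch a short H-level burst, or a short pause within an utterance, resets the timer without producing any event, which corresponds to the purpose of comparing the pulse width against a reference time.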

The holding circuit 75 described above is
additionally provided for the following reason: without the
holding circuit 75, when the reference end threshold level
increases, the smoothed signal (400) drops below the
threshold level before the end of the spoken instruction,
resulting in an erroneous spoken instruction end detection.
In other words, since the time constant of the second
smoother 72 is larger than that of the first smoother 15-2,
the reference threshold level increases gradually, with a
time delay, as the smoothed spoken instruction signal (400)
increases; that is, the timing of the two signals does not
match.
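
The effect of the holding circuit 75 may be illustrated by the following minimal sketch. It assumes a track-and-hold behaviour: the reference follows the scaled noise estimate until a holding signal is present, and is frozen thereafter. The class name, the helper function and all numeric levels used with it are hypothetical.

```python
class HoldingCircuit:
    """Tracks its input until told to hold, then freezes the latched value."""

    def __init__(self):
        self.held = None  # frozen reference while holding, otherwise None

    def output(self, candidate, holding):
        """Return the reference level for the current sample.

        candidate: scaled noise estimate arriving via the analog switch
        holding:   True between the start signal and the end signal
        """
        if not holding:
            self.held = None
            return candidate
        if self.held is None:
            self.held = candidate  # latch the value present at the start signal
        return self.held


def end_detected_early(speech_level, references):
    """True if the speech level falls below any reference during the utterance."""
    return any(speech_level < reference for reference in references)
```

For instance, if the scaled noise estimate creeps from 1.0 up to 3.0 during an utterance whose smoothed level stays at 2.2, the unheld reference overtakes the speech level and an end would be detected too early, whereas the held reference stays at 1.0 and no premature end occurs.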

Next, there will be described the operation of
the third embodiment of Fig. 6.

When a spoken instruction is uttered toward the
microphone with the record switch 1 or recognition switch 3
turned on while the engine is stopped, the engine
operation detector 15 detects that the engine is stopped, and
the analog switch 74 is therefore set to the first
multiplier 73a side via the controller 5. The spoken
instruction, transduced into an electric signal by the
microphone 2, is amplified by the speech input interface 6
and then rectified by the rectifier 15-1. The first
smoother 15-2 outputs a DC voltage corresponding to the
power component of the spoken instruction signal. On the
other hand, the second smoother 72 applies a DC voltage
proportional to the power level of the background noise
included in the spoken instruction signal to the
multipliers 73a and 73b. At this time, since the analog
switch 74 is closed to the first multiplier 73a side, the
output signal from the second smoother 72 is multiplied
by K1 in the first multiplier 73a and is given to the
level comparator 7-1 via the analog switch 74 and the
holding circuit 75 as the reference signal. The level
comparator 7-1 outputs an
H-level output signal when the output of the first smoother
15-2 exceeds the reference signal. If this H-level
output signal persists, for instance, for about
150 milliseconds, a start signal for spoken instruction
recognition or recording is applied to the controller 5;
the spoken instruction signal branched from the output of
the speech input interface 6 is inputted to the bandpass
filters 8; and the spoken instruction is recorded or
recognized by the same circuit sections as in the prior-art
system. Further, when the pulse duration comparator 7-2
outputs the start signal, the output signal of the first
multiplier 73a is held by the holding circuit 75 and the
reference signal for the level comparator 7-1 is fixed.
Next, when the input of the spoken instruction has been
completed, the output of the level comparator 7-1 returns
to an L-level. If this L-level state continues, for
instance, for about 250 milliseconds, the pulse duration
comparator 7-2 outputs an end signal to the
controller 5. The controller 5 determines that the input
of the spoken instruction has been completed and controls
the entire system so as to begin processing the recording
or recognition of the spoken instruction.
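
The rectifying and smoothing stages in the signal path described above may be approximated, purely as a sketch, by a full-wave rectifier followed by a first-order low-pass filter. The function name, the sampling rate and the time constants used with it are assumptions; only the relationship that the second smoother 72 has the larger time constant is taken from the description.

```python
import math


def smooth(samples, time_constant_s, sample_rate_hz):
    """Full-wave rectify, then apply a first-order (RC-style) smoother."""
    # Per-sample coefficient of a discrete first-order low-pass filter.
    alpha = 1.0 - math.exp(-1.0 / (time_constant_s * sample_rate_hz))
    level = 0.0
    smoothed = []
    for x in samples:
        level += alpha * (abs(x) - level)  # abs() plays the role of the rectifier
        smoothed.append(level)
    return smoothed
```

With a short time constant the output follows the envelope of the spoken instruction closely, as in the first smoother 15-2; with a long time constant it responds slowly and therefore stays close to the background noise level during a brief utterance, as in the second smoother 72.
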
On the other hand, when a spoken instruction is
inputted in the same way while the engine is
operating, the engine operation detector 15 detects
the state where the vehicle is running, and the analog
switch 74 is set to the second multiplier 73b side.
Therefore, the DC signal corresponding to the background
noise power level included in the spoken instruction and
given from the second smoother 72 is multiplied at the
multiplication ratio K2, which is greater than K1, and is
applied to the level comparator 7-1 as the reference signal.
Since the level of the reference signal is thus adjusted to
a higher level than that obtained when the engine is
stopped, the level comparator 7-1 generates an H-level
output signal only when a spoken instruction exceeds this
reference signal, that is, only when a relatively loud
spoken instruction is inputted. Also, a loud voice makes
the features of the voice parameters clear. Therefore, in
the state where the engine is operating, that is, where the
vehicle is running and the noise within the passenger
compartment is therefore high, no recognition or recording
is made unless a relatively loud spoken instruction is
uttered toward the microphone 2. If a spoken instruction
having a large energy is inputted, even when the ambient
noise level is high, the proportion of the noise component
to the spoken instruction signal is sufficiently reduced,
so that it is possible to record or recognize the spoken
instruction more reliably.
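
The selection of the reference level according to the engine state may be sketched as follows. The values chosen for K1 and K2 are hypothetical; the description only requires that K1 be smaller than K2. The function names are likewise assumptions.

```python
K1 = 2.0  # hypothetical ratio of the first multiplier 73a (engine stopped)
K2 = 4.0  # hypothetical ratio of the second multiplier 73b (K1 < K2)


def reference_level(noise_level, engine_running):
    """Analog switch 74: pick the multiplier output matching the engine state."""
    ratio = K2 if engine_running else K1
    return ratio * noise_level


def exceeds_reference(speech_level, noise_level, engine_running):
    """Level comparator 7-1: H-level when smoothed speech exceeds the reference."""
    return speech_level > reference_level(noise_level, engine_running)
```

With a noise level of 0.1 in arbitrary units, an utterance at level 0.3 exceeds the reference of 0.2 while the engine is stopped, but not the reference of 0.4 while the engine is operating; the same utterance must rise above 0.4 before it is detected in the latter state.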

In the embodiments described above, the gain of
the speech input interface or the multiplication ratio in
the voice detector is switched in two discrete steps,
according to whether the engine is stopped or
operating. However, it is also possible to
adjust the gain or multiplication ratio continuously
(in analog fashion) according to the magnitude of the noise level.
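
One possible form of such a continuous adjustment is sketched below, assuming a clamped linear mapping from the measured noise level to the multiplication ratio. The mapping itself, the function name and all four parameter values are hypothetical; the description does not specify how the continuous adjustment is to be carried out.

```python
def multiplication_ratio(noise_level, quiet=0.05, loud=0.5, k_min=2.0, k_max=4.0):
    """Interpolate the ratio linearly between k_min and k_max, clamped at both ends.

    All four parameters are hypothetical; the source gives no numeric values.
    """
    if noise_level <= quiet:
        return k_min
    if noise_level >= loud:
        return k_max
    fraction = (noise_level - quiet) / (loud - quiet)
    return k_min + fraction * (k_max - k_min)
```

Under this assumption the ratio stays at its lower value in a quiet compartment, rises steadily as the noise level increases, and saturates at its upper value, so that the two-step switching of the embodiments appears as the limiting case of a very steep mapping.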

Description has been made hereinabove of the case 
where the speech recognition system according to the 
present invention comprises various discrete elements or 
sections; however, it is of course possible to embody the 
system with a microcomputer including a central processing 
unit, a read-only memory, a random-access memory, a clock
oscillator, etc. In this case, the engine operation 
detector, the second smoother, the first and second 
multipliers, the analog switch, the holding circuit, etc. 
can all be incorporated within the microcomputer, executing 
the same or similar processes, calculations and/or 
operations as explained hereinabove. In such case, the 
microcomputer also executes various operations necessary 
for the speech recognizer in accordance with appropriate 
software stored in the read-only memory. 

As described above, in the speech recognition
system according to the present invention, the
amplifier gain in the speech input interface or the
threshold level in the voice detector is adjusted, when
the engine is operating, so that the sensitivity to spoken
instructions is reduced; the driver must therefore utter a
spoken instruction in a louder voice, which reduces the
proportion of noise level to spoken instruction signal
level. Accordingly, even if the noise level within the
passenger compartment rises sharply, it is possible to
improve the reliability of recording and the recognition
rate of spoken instructions in the speech recognition system.

It will be understood by those skilled in the art 
that the foregoing description is in terms of preferred
embodiments of the present invention wherein various 
changes and modifications may be made without departing 
from the spirit and scope of the invention, as is set forth 
in the appended claims. 
CLAIMS:

1. A speech recognition system for an automotive
vehicle for recording or recognizing a spoken instruction
received through a microphone (2) and for activating a vehicle
actuator in response to a recognized spoken instruction,
characterized by:

(a) means for detecting whether or not an
automotive vehicle engine is operating and outputting one
of an engine operation signal and an engine stop signal;

(b) a speech input and voice detection section (6)
connected to the microphone (2) for amplifying a spoken
instruction signal transduced through the microphone,
detecting the start and end of a spoken instruction, and
outputting an instruction start command signal and an
instruction end command signal, respectively, in response
to detection thereof, one of the gain factor in said speech
input section and the threshold level in said voice
detection section being so adjusted that the sensitivity to
spoken instructions can be reduced and that the driver must
necessarily utter a spoken instruction in a louder voice to
reduce the proportion of noise level to spoken instruction
signal level, when said engine operation detecting means
outputs an engine operation signal; and

(c) a voice analysis section (7) connected to said
speech input and voice detection section and responsive to
instruction start and instruction end command signals for
analyzing the spoken instruction signal from said speech
input section, comparing the results of analysis with
predetermined reference values corresponding to at least
one spoken instruction, and activating at least one
actuator when the results of analysis match predetermined
reference values associated with the actuator.

2. A speech recognition system for an automotive
vehicle as set forth in claim 1, wherein said engine
operation detecting means are characterized by:

(a) an ignition relay switch (16) closed when an
ignition switch is turned on and for outputting an ignition
signal;

(b) an alternator (17) for outputting an alternator
signal when an engine is operating; and

(c) an engine operation detector (15) connected to
said ignition relay switch (16) and said alternator (17) for
outputting an engine operation signal in response to the
ignition signal and the alternator signal.



3. A speech recognition system for an automotive
vehicle as set forth in claim 1, wherein said engine
operation detecting means is a speed sensor (171).

4. A speech recognition system for an automotive
vehicle as set forth in claim 1, wherein said engine
operation detecting means is a speedometer (172).

5. A speech recognition system for an automotive
vehicle as set forth in claim 1, wherein said speech input
section is characterized by:

(a) a first amplifier (61) connected to the
microphone;

(b) a second amplifier (62) connected to the
microphone for amplifying the spoken instruction signal
transduced via the microphone (2), the gain factor of which is
smaller than that of said first amplifier; and

(c) an analog switch (63) for connecting said first
amplifier to said voice detection section in response to
the engine stop signal from said engine operation detecting
means and for connecting said second amplifier to said
voice detection section in response to the engine operation
signal from said engine operation detecting means.



6. A speech recognition system for an automotive
vehicle as set forth in claim 1, wherein said speech input
section (6) is characterized by:

(a) an amplifier (60) connected between the
microphone (2) and said voice detection section for amplifying
the spoken instruction signal transduced via the
microphone;

(b) a first feedback resistor (R1) connected to
the input terminal of said amplifier (60);

(c) a second feedback resistor connected to
the input terminal of said amplifier (60), the resistance value
of which is smaller than that of said first feedback
resistor; and

(d) an analog switch (63) for connecting said first
feedback resistor (R1) to the output of said amplifier (60) in
response to the engine stop signal from said engine
operation detecting means and for connecting said second
feedback resistor (R2) to the output of said amplifier in
response to the engine operation signal from said engine
operation detecting means.

7. A speech recognition system for an automotive
vehicle as set forth in claim 1, wherein said voice
analysis section is characterized by:

(a) a first smoother (15-2) connected to said speech
input section for smoothing the spoken instruction signal
amplified via said speech input section (6);

(b) a second smoother (72) connected to said speech
input section for smoothing the spoken instruction signal
amplified via said speech input section, the time constant
of which is larger than that of said first smoother;

(c) a first multiplier (73a) connected to said second
smoother (72);

(d) a second multiplier (73b) connected to said
second smoother (72), the multiplication ratio of which is
greater than that of said first multiplier;

(e) an analog switch (74) connected to said first
multiplier (73a) for outputting the smoothed spoken instruction
signal multiplied by said first multiplier in response to
the engine stop signal from said engine operating detecting
means and connected to said second multiplier for
outputting the smoothed spoken instruction signal
multiplied by said second multiplier (73b) in response to the
engine operation signal from said engine operating
detecting means;

(f) a holding circuit (75) connected to said analog
switch for holding the smoothed spoken instruction signal
passed through said analog switch as a reference threshold
level in response to an instruction start command signal;

(g) a voltage level comparator (7-1), one input
terminal of which is connected to said first smoother (15-2) and
the other input terminal of which is connected to said
holding circuit (75), for comparing the spoken instruction
signal voltage level smoothed by said first smoother with
the reference threshold level outputted from said holding
circuit and outputting an H-level signal when the signal
voltage level smoothed by said first smoother exceeds the
reference threshold level; and

(h) a pulse duration comparator (7-2) connected to
said voltage level comparator for comparing the pulse width
of the H-level signal from said level comparator with a
reference start time and outputting a spoken instruction
start command signal when the pulse width exceeds the
reference start time and for comparing the pulse width of
the L-level signal from said level comparator (7-1) with a
reference end time and outputting a spoken instruction end
command signal when the pulse width exceeds the reference
end time, the spoken instruction start command signal being
applied to said holding circuit (75) as a holding signal.